(54) Disk control system and method

(57) In a disk control system having a RAID controller (16)
for continuously writing data on a data stripe composed of
a plurality of disk apparatuses (180), in response to a
write request, data blocks are sequentially written on
empty areas (34, 40, 51) of a write target data stripe on
the plurality of disks (180) in such a manner that at least
one data block is written at a time. Further, in response
to the write request, logical addresses having address
values prior to address translation are written on logical
address log areas (18b2) on the plurality of disks (180),
as logical-address log information. An upper file system
(50) is notified that the write has been completed after
the data and the logical-address log information have been
completely written.
Description

[0001] The present invention relates to a disk control
system and method, and in particular, to a disk control
system and method that continuously writes write-requested
data to a stripe on a disk using only a main memory and a
disk apparatus, without using any exclusive non-volatile
memory of the kind that constitutes a write buffer and an
address mapping table and is conventionally used for a RAID
control device.
[0002] A method of collectively writing the data of random
writes to a physical stripe as one continuous area, as
shown, for example, in Jpn. Pat. Appln. KOKAI Publication
No. 6-214720 and Jpn. Pat. Appln. KOKAI Publication No.
11-53235, has been proposed as a write method for a disk
control system of a RAID configuration. In the prior art,
however, write data are written to a non-volatile memory
(hereinafter referred to as an "NvRAM") or a volatile
memory, and once write data amounting to one stripe have
been provided, they are written to the physical stripe on
the disk apparatus. Further, the NvRAM stores an address
mapping table in which logical addresses from an upper file
system are translated into physical addresses on the disk.
[0003] When the data are thus written to the NvRAM, 
even if the system is shut down before the data are writ- 
ten to the disk apparatus, the write can be completed by 
referencing the NvRAM after system reboot and correct- 
ly writing the data in the NvRAM which have not been 
written yet, to the disk apparatus (no write data are lost). 
[0004] That is, as long as the data write to the NvRAM has
been completed, no write data are lost. Thus, once the
write data have been written to the NvRAM, the conventional
disk control apparatus notifies, in response to the write
request, the host that the write has been "completed".

[0005] The NvRAM, however, must be provided in an I/O card
(hardware) installed in the system, thus disadvantageously
requiring corresponding costs. Other problems include the
compatibility of the system with a host computer or other
equipment, the need for maintenance, and the like.

[0006] It is an object of the present invention to provide
a disk control system and method that continuously 
writes write-requested data to a stripe on a disk using 
only a main memory and a disk apparatus and without 
using any NvRAM. 

[0007] To achieve this object, the present invention 
provides a disk control system that responds to a write 
request from an upper file system to translate logical ad- 
dresses into physical ones and then continuously write 
write-requested data to a data stripe as a write area 
composed of a plurality of disk apparatuses, the system 
being characterized by comprising means for respond- 
ing to the write request to sequentially write data blocks 
to an empty area of an assigned target data stripe of 
data areas provided on the plurality of disks, in such a 
manner that at least one data block is written at a time,
means for responding to the write request to write the
logical addresses from the upper file system to data 
managing areas provided on the plurality of disks, as 
logical-address log information, and means for notifying, 
in response to the write request from the upper file sys- 
tem, the upper file system that the write has been com- 
pleted, after the data and the logical-address log infor- 
mation have been completely written. 
[0008] According to this aspect of the present invention,
instead of the non-volatile memory, a data managing area
18b, a data area 18c, and the like provided on the disk can
be used to process the request as writes to an area in
which random writes are physically contiguous to one
another. Thus, the adverse effects of seeking and rotation
waiting operations are reduced, thereby achieving a fast
write process without using any non-volatile memory.

[0009] Further, the present invention provides a disk
control system that responds to a write request from an
upper file system to translate logical addresses into
physical ones and then continuously write write-requested
data to a data stripe as a write area composed of a
plurality of disk apparatuses, the system being
characterized by comprising means for writing a plurality
of block data corresponding to a plurality of write
requests, to a write buffer provided in a main memory, data
write means for responding to the plurality of write
requests to simultaneously write all the plurality of data
blocks stored in the write buffer to an empty area of an
assigned target data stripe of data areas provided on the
plurality of disks, log write means for simultaneously
writing the logical addresses from the upper file system
corresponding to the plurality of block data, to data
managing areas provided on the plurality of disks, as
logical-address log information, and means for notifying,
in response to the write requests from the upper file
system, the upper file system that the writes have been
completed, after the data and the logical-address log
information have been completely written.
[0010] According to this aspect of the present invention,
instead of the non-volatile memory, the data managing area
18b, the data area 18c, and the like provided on the disk
can be used to process the request as writes to an area in
which random writes are physically contiguous to one
another. Further, the plurality of write requests can be
processed as one data write process and one write of the
logical-address log information, thereby reducing the
number of write processes required and increasing the
write size. Consequently, the total overhead of the writes
to the disk decreases, thereby improving the throughput of
the write process.
[0011] Moreover, the present invention provides a disk
control system that responds to a write request from an
upper file system to translate logical addresses into
physical ones and then continuously write write-requested
data to a data stripe as a write area composed of a
plurality of disk apparatuses, the system being
characterized by comprising means for responding to the
write request to sequentially write data blocks to an
empty area of an assigned target data stripe of data areas
provided on the plurality of disks, in such a manner that
at least one data block is written at a time, means for
responding to the write request to write the logical
addresses from the upper file system, write data sizes,
and checksums of data written to logical-address areas
provided on the plurality of disks, as logical-address log
information, and means for notifying, in response to the
write request from the upper file system, the upper file
system that the write has been completed, after the data
and the logical-address log information have been
completely written.

[0012] According to this aspect of the present inven- 
tion, if the system fails during a write to the data area of 
the disk, it is checked whether or not a checksum value 
being written to the logical-address log area of a stripe 
being subjected to the write process equals a checksum 
value determined from data from the data area. If these 
checksum values are equal, it is determined that the da- 
ta write has been completed. Accordingly, the data are 
treated as valid, and the remaining part of the process 
(registration in the address mapping table and the like) 
is executed, thus enabling an efficient troubleshooting 
process. 

[0013] Further, the present invention provides a disk
control system that responds to a write request from an
upper file system to translate logical addresses into
physical ones and then continuously write write-requested
data to a data stripe as a write area composed of a
plurality of disk apparatuses, the system being
characterized by comprising means for responding to the
write request to write the logical addresses from the
upper file system, write data sizes, and checksums of data
written to logical-address areas provided on the plurality
of disks, as logical-address log information, means for
responding to the write request to sequentially write data
blocks to an empty area of an assigned target data stripe
of data areas provided on the plurality of disks, in such
a manner that at least one data block is written at a time,
and means for notifying, in response to the write request
from the upper file system, the upper file system that the
write has been completed, after the data and the
logical-address log information have been completely
written.
[0014] According to this aspect of the present invention,
for each write request, once the logical-address log
information and the write data have been completely
written to the logical-address log area, an OS file system
can be notified that the writes have been completed. That
is, the apparatus can respond to the OS file system
quickly.

[0015] Moreover, the present invention provides a disk
control system that responds to a write request from an
upper file system to translate logical addresses into
physical ones and then continuously write write-requested
data to a data stripe as a write area composed of a
plurality of disk apparatuses, the system being
characterized by comprising means for responding to the
write request to record flags indicative of validity or
invalidity, stripe ID numbers, and write time stamps for
final data in header sections of logical-address log areas
provided on the plurality of disks, and write time stamps
for at least one block data processed by the write
request, at least one logical address, and at least one
checksum to entry sections of the logical-address log
areas as logical-address log information, means for
responding to the write request to sequentially write data
blocks to an empty area of an assigned target data stripe
of data areas provided on the plurality of disks, in such
a manner that at least one data block is written at a time,
and means operative if the system fails during the write,
to check whether or not a checksum value being written to
the logical-address log area of a stripe being subjected
to the write process and for which a valid flag has been
set equals a checksum value determined from data from the
data area, and to treat the data as valid if these
checksum values are equal, while determining that the
write has not been completed and discarding the data if
these checksum values are unequal.
[0016] According to this aspect of the present invention,
if the system fails during a write, it is checked whether
or not a checksum value being written to the
logical-address log area of a stripe being subjected to
the write process and for which a valid flag has been set
equals a checksum value determined from data from the data
area. If these checksum values are equal, the data are
treated as valid. If the checksum values are not equal, it
is determined that the write has not been completed, and
the data are discarded, thus enabling an efficient
troubleshooting process.
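
The check described in paragraph [0016] can be sketched as follows. This is a minimal illustration only, not the claimed implementation: the names LogEntry, checksum and is_write_complete are hypothetical, and CRC-32 is assumed as the checksum because the description does not name a particular algorithm.

```python
import zlib
from dataclasses import dataclass


@dataclass
class LogEntry:
    """Hypothetical entry of the logical-address log (field names assumed)."""
    logical_address: int
    size: int        # write data size in bytes
    checksum: int    # checksum recorded when the write was issued


def checksum(data: bytes) -> int:
    # CRC-32 is an assumption; the description only says "checksum".
    return zlib.crc32(data)


def is_write_complete(valid_flag: bool, entry: LogEntry, data_area: bytes) -> bool:
    """Return True if the data found in the data area should be treated as valid."""
    if not valid_flag:
        # Only stripes whose valid flag has been set are examined.
        return False
    # Recompute the checksum from the data area and compare with the log.
    return checksum(data_area[:entry.size]) == entry.checksum
```

If the function returns False, the recovery process would discard the data; if it returns True, the remaining steps (such as registration in the address mapping table) would be carried out.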
[0017] According to this aspect of the present invention,
in a disk system employing a log-type data block managing
system, the write address information from the file system
is written to a fixed area on the disk, so that a cache of
a RAID controller can be effectively used to reduce the
time required for writes. By introducing the checksum into
the management of the address mapping table, its
consistency can be checked after a troubleshooting process
or the like following an unexpected system shutdown,
thereby making the system more reliable.

[0018] This summary of the invention does not necessarily
describe all necessary features so that the invention may
also be a sub-combination of these described features.

[0019] The invention can be more fully understood from the
following detailed description when taken in conjunction
with the accompanying drawings, in which:

FIG. 1 is a block diagram showing the configuration of a
computer system to which a disk control system of the
present invention has been applied;
FIG. 2 is a block diagram showing the principle of the
control of writes to a disk array using a RAID speed-up
driver according to the present invention;
FIG. 3 is a block diagram showing the relationship between
the RAID speed-up driver and the disk as well as the
control of writes using the RAID speed-up driver according
to a first embodiment of the present invention;
FIG. 4 is a flow chart showing a main routine for a write
process according to the first embodiment of the present
invention;
FIG. 5 is a flow chart showing the procedure of I/O
completing processes 1 and 2 according to the first
embodiment of the present invention;
FIG. 6 is a flow chart showing the procedure of an I/O
completing process 3 according to the first embodiment of
the present invention;
FIG. 7 is a flow chart showing a main routine for a write
process according to a third embodiment of the present
invention;
FIG. 8 is a flow chart showing the procedure of a timer
function according to the third embodiment of the present
invention;
FIG. 9 is a flow chart showing the procedure of I/O
completing processes 1 and 2 according to the third
embodiment of the present invention;
FIGS. 10A and 10B illustrate a diagram showing the
organization of write requests Reqi and how the requests
are stored in a pending list, according to the third
embodiment of the present invention;
FIG. 11 is a diagram showing a data stripe and a write to
a logical-address log area according to the third
embodiment of the present invention;
FIG. 12 is a diagram showing the data stripe and a write
to the logical-address log area according to a fourth
embodiment of the present invention;
FIG. 13 is a diagram showing the data stripe and a write
to the logical-address log area according to a fifth
embodiment of the present invention;
FIG. 14 is a flow chart showing a main routine for a write
process according to the fifth embodiment of the present
invention;
FIGS. 15A and 15B illustrate a flow chart showing the
procedure of I/O completing processes 1 and 2 according to
the fifth embodiment of the present invention;
FIG. 16 is a flow chart showing a main routine for a write
process according to a sixth embodiment of the present
invention;
FIGS. 17A to 17D illustrate a diagram showing the
organization of logical-address logs LA0 to LA3 according
to the sixth embodiment of the present invention; and
FIG. 18 is a diagram showing how TAG information is
written to the data stripe according to the sixth
embodiment of the present invention.

DETAILED DESCRIPTION OF THE INVENTION 

[0020] First, various terms used herein will be de- 
scribed. 



(Data Stripe) 

[0021] The term "data stripe" means a unit of data
"collectively written" by a disk control system (this
operation is also referred to as a "batch write"). The
data stripe is a continuous area on disk partitions and has
a size equal to an integral multiple of the size of a
stripe managed by a RAID controller. For the RAID 5
configuration, this size can be set to that of a parity
group to mitigate what is called the "RAID 5 write
penalty", thus substantially improving the performance of
the system. The stripe is composed of a plurality of data
blocks.

(Stripe Number) 

[0022] The term "stripe number" means the serial numbers
of the physical stripes arranged on the partitions.

(Logical Block Number = Logical-Address Number) 

[0023] The term "logical block number" refers to data
block numbers on the partitions as viewed from an upper
file system. In a disk control system, when the upper file
system requests an access, it uses logical block numbers,
which are virtual. The logical block numbers are
associated with physical block numbers (arranged on the
physical partitions) by an "address mapping table" managed
by the disk control system. A byte offset value (address)
on the partition is determined on the basis of
(physical block number) × (block size [bytes]).

(Address Mapping Table (AMT)) 

[0024] In the disk control system, when the upper file 
system requests an access, it uses logical block num- 
bers, which are virtual. The logical block numbers are 
associated with the physical block numbers (arranged 
on the physical partitions) by the "address mapping ta- 
ble", managed by the disk control system. In the embod- 
iments of the present invention, the address mapping 
table is located on a target partition and is assigned with 
an area different from data sections. The address map- 
ping table has the physical block numbers registered 
therein, which correspond to the logical block numbers. 
When a new data block is to be written, the physical 
block number (address on the disk) to which that block 
is written is registered in the entry of the corresponding 
logical block number. On the other hand, when a data 
block is to be referenced, the value of the entry having 
the logical address of that data block as an index is de- 
termined and used as a physical block number on the 
disk to determine an actual address for reference. 
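
The registration and reference operations described in paragraph [0024] can be illustrated with the following minimal sketch. It is an in-memory model under stated assumptions, not the on-disk layout of the embodiments: the class name AddressMappingTable and its methods are hypothetical, and a block size of 2 KB is assumed.

```python
BLOCK_SIZE = 2048  # bytes; an assumed block size for illustration


class AddressMappingTable:
    """Hypothetical in-memory model of the address mapping table (AMT)."""

    def __init__(self):
        # Index: logical block number -> value: physical block number.
        self._entries = {}

    def register(self, logical_block: int, physical_block: int) -> None:
        # Called when a new data block is written: the physical block the
        # data landed on is stored in the entry of the logical block number.
        self._entries[logical_block] = physical_block

    def byte_offset(self, logical_block: int) -> int:
        # Called when a data block is referenced: the entry is looked up and
        # converted to (physical block number) x (block size [bytes]).
        return self._entries[logical_block] * BLOCK_SIZE
```

Because every new write simply overwrites the entry of the logical block number, rewriting the same logical block to a different physical location only changes the value returned by the next lookup.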
[0025] An embodiment of the present invention will be 
described below with reference to the drawings. 
[0026] FIG. 1 shows the configuration of a computer 
system to which a disk control system according to one 
embodiment of the present invention has been applied. 
This computer system is used, for example, as a server
(PC server) that can have a plurality of CPUs (#1, #2,
and #N) 11 mounted therein. These CPUs 11 are connected to
a bridge 12 via a processor bus 1 as shown in the figure.
The bridge 12 is a bridge LSI for connecting the processor
bus 1 and a PCI bus 2 for bidirectional communications,
and has a built-in memory controller for controlling a
main memory 13. The main memory 13 has an operating system
(OS), an application program to be executed, a driver, and
the like loaded therein. Further, in the present
invention, the main memory 13 has a driver work area 17
provided with a write buffer (collective-write buffer)
171, an address mapping table cache (hereinafter referred
to as an "AMT cache") 172, and a stripe managing table 173.

[0027] The write buffer 171 is a data buffer for accu- 
mulating write data blocks therein. Once, for example, 
write data blocks amounting to one physical stripe have 
been accumulated in the write buffer 171, a batch write
to the disk array 18 is started. The present invention, 
however, is not limited to this operation. If a collective 
write is to be executed, the physical stripe will be a unit 
of "collective write" and will be composed of a series of 
contiguous areas on partitions formed in the entire stor- 
age area of the disk array 18. Each physical stripe has 
a size equal to an integral multiple of the size of a stripe 
unit managed by a RAID controller 16. 
[0028] The AMT cache 172 stores address mapping
information indicative of the correspondences between a
plurality of logical block numbers constituting a logical
address space and corresponding physical block numbers
each indicative of a physical location on the disk array
18 in which the data block designated by the corresponding
logical block number is present. When the OS file system
requests an access, it uses logical block numbers, which
are virtual. The logical block numbers are associated with
the physical block numbers (arranged on the physical
partitions) by the AMT cache 172. Further, a byte offset
value from the leading location of the partition is
determined on the basis of a physical block number × a
block size (bytes).
[0029] Moreover, the AMT cache 172 has a plurality 
of entries corresponding to the respective logical block 
numbers. When a new data block is to be written, the 
physical block number (physical address) to which that 
block is actually written is registered in the entry corre- 
sponding to the write-requested logical block number. 
On the other hand, when a data block is to be read, a 
physical block number is determined from the entry cor- 
responding to the read-requested logical block number, 
and a data block is read from the physical location on 
the disk partition designated by that physical block 
number (physical block number * block size). 
[0030] The stripe managing table 173 manages infor- 
mation on the logical stripes, and this information is used 
for a process of relocating data and other processes. 
[0031] The PCI bus 2 has a RAID controller 16 connected
thereto. The disk array 18, composed of a plurality of
disk apparatuses controlled by the RAID controller 16, is
used for recording various user data and for other
purposes.

[0032] The disk array 18 functions as a disk array of, for
example, a RAID 5 configuration under the control of the
RAID controller 16. In this case, the disk array 18 is
composed of N + 1 (in this case, five (DISK0 to DISK4))
disk apparatuses including N for storing data and an
additional one for storing parities. These N + 1 disk
apparatuses are grouped and used as a single logical disk
drive.

[0033] The grouped disk apparatuses are assigned physical
stripes (parity groups) each composed of data and their
parity, and the parity location of each physical stripe is
sequentially shifted among the N + 1 disk apparatuses. For
example, the parity of a group of data on a physical
stripe S0, assigned to the same location of the disks
DISK0 to DISK3, is recorded on the corresponding stripe in
the disk DISK4. Further, the parity corresponding to data
on a physical stripe S1 is recorded on the corresponding
stripe in the disk DISK3. By distributing the parities of
the physical stripes among the N + 1 disk apparatuses,
accesses are prevented from concentrating on the parity
disk.
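
The parity rotation of paragraph [0033] can be sketched as below. The description only states that the parity of stripe S0 is on DISK4 and that of S1 is on DISK3; the assumption here is that the rotation continues one disk at a time and wraps around, and the function names are hypothetical.

```python
def parity_disk(stripe_number: int, num_disks: int = 5) -> int:
    """Index of the disk holding the parity of the given physical stripe.

    Assumes the parity moves down by one disk per stripe and wraps around,
    which matches the two cases given in the text (S0 -> DISK4, S1 -> DISK3).
    """
    return (num_disks - 1 - stripe_number) % num_disks


def data_disks(stripe_number: int, num_disks: int = 5) -> list:
    """Indices of the disks holding the data of the given physical stripe."""
    parity = parity_disk(stripe_number, num_disks)
    return [disk for disk in range(num_disks) if disk != parity]
```

Under this assumed rotation, each of the five disks holds the parity of every fifth stripe, so parity accesses are spread evenly over the array.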
[0034] The driver work area 17 of the main memory 13 is
used to implement a RAID speed-up driver. The RAID
speed-up driver is used to improve the performance of
writes to the disk array 18. In this embodiment, the RAID
speed-up driver is implemented using a driver program
incorporated in the OS and the driver work area 17 and
without modifying an OS file system 50. Next, the
principle of the control of writes to the disk array 18
using the RAID speed-up driver will be described with
reference to the drawings.
[0035] A RAID speed-up driver 100 is provided as a filter
driver located between the OS file system 50 and the
physical disks (disk array 18). The RAID speed-up driver
100 is responsive to a write request from the file system
50 to execute the functions of (1) carrying out address
mapping to set a next empty area of a write target stripe
as a write target physical address (address in the logical
partitions on the RAID) and then carrying out an actual
write, and (2) registering the write target address of the
write request in the AMT cache 172.
[0036] The RAID speed-up driver 100 is responsive to the
write request from the OS file system 50 to translate the
requested logical addresses into physical ones using the
AMT cache 172 and write transformed write data to a data
stripe formed in the disk array 18 (this also applies to a
series of write requests with discontinuous target
addresses). The continuous area of contiguous disk
addresses that constitutes this data stripe has an
alignment and size appropriate for efficient writes to the
disk. The alignment and size depend on a driver directly
operating the disk and on the RAID controller 16. In
particular, if the write target disk array 18 has the RAID
5 configuration, the continuous area corresponds to a
"parity group" or its integral multiple.

[0037] With a write method using the RAID speed-up driver
100, instead of determining data write locations according
to the logical addresses contained in the write requests
from the host computer (in this embodiment, the OS file
system 50), the write data are sequentially accumulated in
the order designated in the write requests from the host
computer to form a large data block composed of a
plurality of write data blocks. Then, the large data block
is collectively and sequentially written to an empty area
of the disk array 18 from top to bottom. A unit of the
"collective write" is a physical stripe. That is, one free
physical stripe is generated for each "collective write",
and the data blocks corresponding to one physical stripe
are sequentially written to this physical stripe. Thus,
random accesses can be converted into sequential ones to
substantially improve the write performance. The present
invention, however, is not limited to the above described
"collective write", but the write process may be executed
for each write request from the OS file system 50 or each
group of a plurality of write requests.

[0038] FIG. 2 shows a write operation based on the above
described "collective write". This is an example where the
block data size of write data transmitted by the OS file
system 50 is 2 KB, the data size of one stripe unit is 64
KB, and the data size of one physical stripe (parity
group) is 256 KB (64 KB × 4). The 2-KB write data block is
obtained by the RAID speed-up driver 100, incorporated in
the OS, and is accumulated in the write buffer 171 in the
driver work area 17 of the main memory 13.

[0039] Essentially, once 256 KB of data blocks (2 KB × 128
data blocks) have been accumulated in the write buffer 171
in the driver work area 17, they are collectively written
to one physical stripe in the disk array 18 at a time. In
this case, the RAID controller 16 can generate a parity
only from the 256 KB of write data blocks, thereby
eliminating the need for processes of, for example,
reading old data in order to calculate the parity.
Consequently, the well-known RAID 5 write penalty can be
reduced.
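
The arithmetic of paragraphs [0038] and [0039] can be checked with the following sketch. It assumes the usual RAID 5 parity, a bytewise XOR of the four 64-KB stripe units; the function names are hypothetical, and in the embodiment the parity is generated by the RAID controller 16 rather than by the driver.

```python
BLOCK_SIZE = 2 * 1024      # 2-KB write data block
STRIPE_UNIT = 64 * 1024    # 64-KB stripe unit
DATA_UNITS = 4             # 64 KB x 4 = 256-KB physical stripe


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equally long byte strings."""
    value = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return value.to_bytes(len(a), "big")


def split_units(stripe_data: bytes) -> list:
    """Split one full physical stripe into its four stripe units."""
    if len(stripe_data) != STRIPE_UNIT * DATA_UNITS:
        raise ValueError("a full 256-KB stripe is required")
    return [stripe_data[i * STRIPE_UNIT:(i + 1) * STRIPE_UNIT]
            for i in range(DATA_UNITS)]


def full_stripe_parity(stripe_data: bytes) -> bytes:
    """Parity computed from the new data alone, with no read of old data."""
    parity = bytes(STRIPE_UNIT)
    for unit in split_units(stripe_data):
        parity = xor_bytes(parity, unit)
    return parity
```

Since all four stripe units of the parity group are present in the write buffer, the parity depends only on the buffered data, which is why no old data or old parity has to be read.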

(First Variation of the Embodiment) 

[0040] Now, an operation of executing a write to the disk
array 18 using the RAID speed-up driver 100 will be
described with reference to FIG. 3. This figure shows how
the RAID speed-up driver 100 processes write requests
Req1, Req2, Req3, Req4, ... transmitted by the upper file
system 50.

[0041] In FIG. 3, a write/reference process is shown
to be executed on one disk, but a disk 180 in FIG. 3 is 
logical and a RAID configuration is actually used which 
is composed of a plurality of physical disks (DISK0 to 
DISK4) as shown in FIG. 1 or 2. Then, the entire disk 
180 constitutes one partition. 

[0042] In FIG. 3, the partition of the disk 180 is divided
into the address mapping table area (hereinafter referred
to as the "AMT area") 18a, a managing data area 18b
composed of a stripe ID log area 18b1 and a logical-address
log area 18b2, and a data area 18c.

[0043] For example, it is assumed that the physical data
stripe corresponding to the current write target is a
"stripe 34" in the data area 18c. Once write data have
been assigned to all the areas of the write target data
stripe 34, the RAID speed-up driver 100 assigns a new free
stripe as the next write target. A subsequent write
process for the write request data is executed on the
areas of the newly assigned stripe.
[0044] In response to the write requests Req1, Req2,
Req3, Req4, ... from the OS file system 50, the RAID
speed-up driver 100 writes the data stored in the write
buffer 171 to the stripe 34 in the data area 18c of the
disk 180, and registers the actual physical addresses that
have undergone the writes, in the AMT cache 172. Further,
in response to a read request from the OS file system 50,
the RAID speed-up driver 100 references the AMT cache 172
to determine the physical address corresponding to a
designated logical address, and returns data corresponding
to the result of a read of that physical address from the
disk 180, to the OS file system 50.

[0045] Each data stripe in the data area 18c has a "TAG
area" TAG in the last block. When data is written to each
stripe, (1) the times when corresponding writes were
executed on that data stripe (or time stamps TS as
sequence numbers for the writes) and (2) the logical
addresses of valid data blocks of the data stripe are
recorded and saved in the TAG area.
[0046] Further, during a write to a data stripe, the
"logical address" of each of the data blocks written to
the data stripe, currently undergoing the write, is
recorded in the logical-address log area 18b2 of the
managing data area 18b.

[0047] Moreover, the IDs of stripes selected as the write
target data stripe are recorded in the stripe ID log area
18b1 of the managing data area 18b in a time series
manner. This example shows that the stripe ID "31"
underwent a write before the stripe ID "34".
[0048] FIG. 3 shows, in an upper part thereof, the driver
work area 17, provided on the main memory 13. This figure
shows that the area 17 includes sub-areas for the write
buffer 171 and the AMT cache 172. The AMT cache 172 holds
part of the address mapping table stored in the AMT area
18a, and is accessed during system operation. If an access
is requested in connection with a logical address that is
not registered in the AMT cache 172, a corresponding
content is read from the address mapping table in the AMT
area 18a and registered in the AMT cache 172. If the
contents of the AMT cache have been updated or the system
is to be shut down, the contents of the AMT cache 172 are
written back to the address mapping table in the AMT area
18a.

[0049] The write buffer 171 is an area in which rele- 
vant data are first buffered in response to a write request 
from the OS file system 50. One write buffer has a size 
equal to one data stripe, and FIG. 3 shows only one write
buffer due to the limited space in the figure. A plurality 
of write buffers, however, are provided. In the first vari- 
ation of the embodiment, the write buffer 171 is not nec- 
essarily required. For example, a buffer memory may be 
provided which stores one or more data blocks. 
[0050] The AMT cache 172 is an area in which the
address mapping table in the AMT area 18a on the disk
is cached and stored. To reference or change the AMT
cache 172, the RAID speed-up driver 100 loads data
from the AMT area 18a into the AMT cache 172 in such
a manner that a predetermined fixed size of data (for
example, 4 KB) are loaded at a time, and references or
changes them on the main memory 13. If the cached
AMT cache 172 has been changed, it is written back to
the original AMT area 18a.
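
The caching behaviour described for the AMT cache 172 can be illustrated with a minimal sketch. It is not the implementation of the embodiment: the class name AmtCache, the 4-byte entry size, and the use of a Python list to stand in for the AMT area 18a are all assumptions made for illustration. Only the fixed 4-KB load unit and the write-back of changed contents come from the text.

```python
CHUNK_BYTES = 4 * 1024      # fixed load size from the text ("for example, 4 KB")
ENTRY_BYTES = 4             # assumed size of one mapping entry (not in the source)
ENTRIES_PER_CHUNK = CHUNK_BYTES // ENTRY_BYTES


class AmtCache:
    """In-memory cache over an on-disk address mapping table (hypothetical)."""

    def __init__(self, amt_area):
        self.amt_area = amt_area      # list standing in for the AMT area 18a
        self.chunks = {}              # chunk index -> list of physical addresses
        self.dirty = set()            # chunk indices changed since last write-back

    def _load(self, chunk_no):
        # On a miss, load one fixed-size chunk from the on-disk table.
        if chunk_no not in self.chunks:
            start = chunk_no * ENTRIES_PER_CHUNK
            self.chunks[chunk_no] = list(
                self.amt_area[start:start + ENTRIES_PER_CHUNK])
        return self.chunks[chunk_no]

    def lookup(self, logical):
        chunk_no, offset = divmod(logical, ENTRIES_PER_CHUNK)
        return self._load(chunk_no)[offset]

    def update(self, logical, physical):
        chunk_no, offset = divmod(logical, ENTRIES_PER_CHUNK)
        self._load(chunk_no)[offset] = physical
        self.dirty.add(chunk_no)

    def write_back(self):
        # Copy every changed chunk back to the on-disk table.
        for chunk_no in sorted(self.dirty):
            start = chunk_no * ENTRIES_PER_CHUNK
            self.amt_area[start:start + ENTRIES_PER_CHUNK] = self.chunks[chunk_no]
        self.dirty.clear()
```

In this sketch an update touches only the in-memory chunk; the on-disk table changes when write_back() is called, which corresponds to the write-back on update or shutdown mentioned above.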

[0051] Next, an operation of writing data to the parti- 
tion of the disk 180 using the RAID speed-up driver 100 
will be described with reference to FIG. 4. In the follow- 
ing description, the write target data stripe is the "stripe 
34". 

[0052] FIG. 4 shows the main routine of the write
process. The OS file system 50 inputs a write request
Reqi (i = 1, 2, 3, 4, ...) composed of write block
data B0 to Bn starting with a logical address addri and
each consisting of 2 KB (step S401).
[0053] In response to the write request Reqi, it is
checked whether or not the assigned write target stripe
has an empty area (step S402). If it has no empty area,
one of the free stripes on the disk is selected (step S403).
Then, the ID of this stripe (in this example, the stripe 34)
is defined as IDk, which is then written to the stripe ID
log area 18b1 of the managing data area 18b as the next
entry (step S404). The stripe ID log area 18b1 has the
IDs of the stripes selected as the write target recorded
therein in a time-series manner.
[0054] Then, the RAID speed-up driver 100 assigns
the logical-address log area 18b2 of the managing data
area 18b (step S405). The logical-address log area
18b2 is used to record the logical addresses of the data
of the write request Reqi when the data are written to
the data stripe 34. The logical-address log area 18b2
has the ID of the write target data stripe recorded at its
head, and subsequently stores data indicating which
logical address corresponds to the data retained by each
of the physical blocks constituting the stripe. If the write
target data stripe 34 has already been assigned at step
S402, steps S403, S404, and S405 are omitted and the
process proceeds to step S406.
[0055] Then, the RAID speed-up driver 100 assigns
the write request Reqi to the empty area of the write
target data stripe 34 by dividing the request into 2-KB
blocks. This operation continues until all the empty
areas of the data stripe 34 have been used up or all the
blocks of the write request Reqi have been assigned
(step S406).

[0056] Then, the RAID speed-up driver 100 issues an
I/O request for a data write to the data stripe 34 (step
S407). At this time, the data that can be written to the
data stripe are grouped into one on the write buffer 171
before the I/O request is issued. For example, the write
request Req1 in FIG. 3 requires two 2-KB data blocks to
be written to the data stripe, but the I/O request is issued
by treating these data as one write request. This enables
efficient writes based on the characteristics of the
disk. Furthermore, an I/O request is issued which requires
information on the logical addresses of the write
data to be written to the logical-address log area 18b2
assigned at step S405. In this example, the stripe ID =
34 is written to the head of the logical-address log area
18b2. As the logical addresses V of subsequently written
data, V = 100 and 101 are written for a write of the
two-block data of the write request Req1, and similarly
V = 76, 77, and 78 are written for the three-block data of
the request Req2, V = 60 is written for the one-block
data of the request Req3, and V = 79, 80, 81, and 82 are
written for the four-block data of the request Req4 (step
S408). The I/O requests issued at steps S407 and S408,
described above, are processed asynchronously.
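
Steps S406 to S408 can be sketched as follows, under stated assumptions. The class StripeWriter, its fields, and the stripe capacity of ten blocks are hypothetical; the source fixes only the 2-KB block size, the stripe ID at the head of the log, and the example requests Req1 to Req4. Asynchronous I/O is not modelled; each issued I/O is simply recorded in a list.

```python
BLOCK_BYTES = 2 * 1024   # block size used throughout the text
STRIPE_BLOCKS = 10       # assumed stripe capacity; the source does not fix it


class StripeWriter:
    def __init__(self, stripe_id):
        self.stripe_id = stripe_id
        self.next_free = 0               # index of the next empty block
        self.log_area = [stripe_id]      # stripe ID at the head, then addresses
        self.issued = []                 # (first physical block, block count)

    def free_blocks(self):
        return STRIPE_BLOCKS - self.next_free

    def write(self, first_logical, nblocks):
        """Place as many blocks as fit; return how many were placed."""
        placed = min(nblocks, self.free_blocks())
        if placed:
            # One grouped data I/O for the contiguous run (step S407).
            self.issued.append((self.next_free, placed))
            # One log I/O carrying the logical addresses (step S408).
            self.log_area.extend(range(first_logical, first_logical + placed))
            self.next_free += placed
        return placed


writer = StripeWriter(34)
for first_logical, nblocks in [(100, 2), (76, 3), (60, 1), (79, 4)]:
    writer.write(first_logical, nblocks)
```

After the four requests, writer.log_area holds the stripe ID 34 followed by the logical addresses in arrival order, matching the example in the text.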
[0057] Then, it is checked whether or not the write
target data stripe 34 contains any empty area (step S409).
If it is determined to contain no empty area, that is, write
data amounting to the total capacity of the write target
data stripe 34 have been provided, an I/O request for a
write of TAG information (a table of the logical block
numbers of the blocks written to the target data stripe
34) is issued (step S410). The TAG information may in
principle be arranged anywhere on the disk 180, but in
this variation of the embodiment, it is present as one block
of the data stripe 34. The process of writing the TAG
information at step S410 is executed asynchronously.
Then, the TAG information on the data blocks written to
the write target data stripe 34 is registered in the AMT
cache 172, which is then updated (step S411). This process
is also executed asynchronously because the operation of
the AMT cache 172 may require an I/O request. On the
other hand, if the write target data stripe 34 is determined
to contain an empty area at step S409, steps
S410 and S411, described above, are omitted, and the
process proceeds to step S412.
[0058] Finally, the RAID speed-up driver 100 determines
whether or not write I/O requests have been issued
for all the data blocks in the request Reqi (step
S412). If there still remain any data to write, the process
returns to step S402 to assign a new stripe and make
write requests as described above.
[0059] If write I/O requests have been issued for all
the data blocks of the write request Reqi at step S412,
described above, the disk write process is completed.
[0060] FIG. 5 is a flow chart showing the operation of
a process of completing the I/O requests issued at steps
S407 and S408, described above (this is normally a
completion interrupt process). An I/O completing
process 1 corresponds to step S407, whereas an I/O
completing process 2 corresponds to step S408.
[0061] Once the data has been written (I/O completing
process 1), the RAID speed-up driver 100 checks
whether or not the logical-address log numbers corresponding
to the completed data write have been written (step
S501). On the other hand, once the logical-address
log has been written (I/O completing process 2), it checks
whether or not the data write corresponding to the
completed write of the logical-address log has been
completed (step S510). If these writes have been completed,
it is checked whether or not the writes of all the data and
logical-address log relating to the original write request
Reqi have been completed (step S502). If these writes have
been completed, the RAID speed-up driver 100 notifies the
OS file system 50 that all the writes to the disk for the
write request Reqi have been executed (step S503).
That is, once all the write blocks have been written to
the data stripe 34 and the logical-address information
for these blocks has been written to the logical-address
log area 18b2, the RAID speed-up driver 100 notifies
the OS file system 50, having issued the write request
Reqi, that the request has been completed.
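
The rule that the completion notification waits for both asynchronous writes can be illustrated with a short sketch. The class PendingWrite and the callback notify are hypothetical names; the sketch tracks a request with a single data I/O and a single log I/O, which is a simplification of the per-request bookkeeping at steps S501, S510, and S502.

```python
class PendingWrite:
    """Tracks the two asynchronous I/Os issued for one write request."""

    def __init__(self, request_id, notify):
        self.request_id = request_id
        self.notify = notify          # callback standing in for the OS notification
        self.data_done = False
        self.log_done = False
        self.notified = False

    def _maybe_notify(self):
        # Steps S502/S503: notify only when both writes have finished.
        if self.data_done and self.log_done and not self.notified:
            self.notified = True
            self.notify(self.request_id)

    def on_data_complete(self):       # I/O completing process 1
        self.data_done = True
        self._maybe_notify()

    def on_log_complete(self):        # I/O completing process 2
        self.log_done = True
        self._maybe_notify()


completed = []
req = PendingWrite("Req1", completed.append)
req.on_log_complete()
notified_after_first = list(completed)   # still empty: the data write is pending
req.on_data_complete()
```

Whichever completion arrives second triggers the notification, so the order in which the two I/Os finish does not matter.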
[0062] FIG. 6 is a flow chart showing the operation of
a process of completing a TAG write I/O request described
in step S410 in FIG. 4. In the process of completing
a TAG write I/O request (I/O completing process
3), the logical-address log area assigned to the stripe
34, for which the TAG write I/O request has been completed,
is released (step S600) to complete the write of
the TAG information.

[0063] The first variation of the embodiment has been
described. Data are conventionally written to physically
separate areas on the disk in response to a random write
request, so that the performance decreases due to disk
seeking and rotation waiting operations. According to
the present invention, instead of the non-volatile memory,
the AMT area 18a (including the AMT cache 172 on
the main memory), managing data area 18b, and data
area 18c, provided on the disk, can be used to execute
random writes as one write to a physically contiguous
area. Consequently, the adverse effects of seeking and
rotation waiting operations are reduced, thereby achieving
a fast write process. For disks of the RAID5 configuration,
if stripe boundaries and sizes are adjusted so as to execute
writes for respective parity groups, once all the writes to
the stripe have been executed, the parity group is provided
on the cache memory of the RAID controller. Consequently,
no discrete process is required for calculating
the parity, thus making it possible to eliminate what is
called the "RAID5 write penalty" to thereby achieve a
fast write process.

[0064] Further, if an unexpected system down aborts
the process before the series of data writes is completed,
the logical-address log information recorded in
the logical-address log area 18b2 can be used to execute
a data recovery process after system reboot. The
logical-address log information becomes unnecessary
when the relationship between the logical and physical
addresses of all the write data blocks written to the write
target data stripe 34 is registered as the TAG information.

(Second Variation of the Embodiment) 

[0065] A second variation of the embodiment of the 
present invention will be described. 
[0066] In the first variation of the embodiment shown
in FIG. 4, the logical-address log area 18b2 is assigned
when the empty area of the stripe is assigned at steps
S403 to S405. A single fixed logical-address log area
18b2 may be provided as shown in FIG. 3, or a plurality
of logical-address log areas 18b2 may be provided. In the
second variation of the embodiment, the write process is
executed using a plurality of logical-address log areas 18b2.
[0067] First, according to this variation of the embodiment,
when a new logical-address log area is assigned
at step S405 in FIG. 4, one of a plurality of logical-address
log areas (18b2-1 to 18b2-m) is selected which
has been most recently used and released.
[0068] To achieve this, an arrangement is provided for
managing the logical-address log areas 18b2 in use based
on the LRU method. Stack-like free-resource managing
means (not shown) is also provided which manages the ID
numbers indicative of the logical-address log areas
(18b2-1 to 18b2-m). During system initialization, the IDs
of all the logical-address log areas are registered in the
free-resource managing means.

[0069] Once the TAG write has been completed at
step S410 in FIG. 4, the RAID speed-up driver 100 releases
the logical-address log area assigned to that
stripe (step S600 in FIG. 6). In this case, the ID number
(assumed to be 18b2-1) indicative of the released
logical-address log area is placed at the top of the stack of
the free-resource managing means. At step S405 in FIG.
4, if a new logical-address log area is required, the ID
number held at the top of the stack is read, and
the logical-address log area 18b2-1 corresponding to
that ID number is obtained. That is, the most recently
released logical-address log area 18b2-1 is obtained. This
management enables the constant use of the "logical-
address log area that has been most recently subjected
to a write process and then released".
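
The stack-like free-resource management can be sketched in a few lines. The class name LogAreaPool and the use of strings as area IDs are assumptions for illustration; the text specifies only that released IDs are placed at the top of a stack and that a new area is taken from the top.

```python
class LogAreaPool:
    """Stack of free logical-address log area IDs (hypothetical sketch)."""

    def __init__(self, area_ids):
        # At initialization, all area IDs are registered as free.
        self.free = list(area_ids)

    def acquire(self):
        # Step S405: take the ID at the top of the stack.
        return self.free.pop()

    def release(self, area_id):
        # Step S600: the released ID goes on top of the stack.
        self.free.append(area_id)


pool = LogAreaPool(["18b2-3", "18b2-2", "18b2-1"])
first = pool.acquire()     # top of the stack
pool.release(first)        # TAG write completed, area released
second = pool.acquire()    # the same area is handed out again
```

Because the most recently released ID is always on top, a workload that uses one area at a time keeps returning to the same area, which is the property the text relies on.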

[0070] In general, with a RAID controller having a
cache mounted therein and backed up by batteries,
when the same areas undergo repeated writes, the
cache retains the data for a long time. Accordingly,
subsequent write processes with respect to these areas
become faster. In the second variation of the embodiment,
this characteristic is used to manage the logical-address
log areas 18b2 so as to use the same logical-address
log area 18b2-1 as often as possible, thus reducing write
overheads.

[0071] The effects of the second variation of the
embodiment will be described in further detail.
[0072] Data writes are efficient because the data are
written to a continuous area corresponding to a stripe
as a parity group. Writes of logical-address log information,
however, are not always executed on the basis of
the parity group and are thus inefficient. The completion
notification is issued in response to the write request
after both the "write of data" and the "write of
logical-address log information" have been completed. Thus,
if the write of logical-address log information is delayed,
the "entire write process" is delayed.
[0073] On the other hand, with a RAID controller having
a cache memory for managing the addresses of read
and write data in the disk and having a write-back cache
function, if a write request for the same address is
repeated, the second and subsequent writes are often
completed in a short time. This is because the last data
written to the write-back cache has been stored and
because if data is written to the same address, only the
data on the cache has to be changed. Thus, no actual write
to the disk is executed. According to the policy of the
RAID controller toward cache control, that cache area
is later written to the disk and then released. If, however,
a write to the same address is periodically repeated at
short intervals, that cache area is constantly retained to
enable relatively fast accesses.
[0074] In the second variation of the embodiment, this
characteristic is used to increase the speed of the write
process for a logical-address log. For example, (1) a
plurality of areas (logical-address log areas) are provided
to which log information on the logical addresses of data
corresponding to data writes to one stripe is saved. (2)
If a new write target data stripe is to be assigned, one
of the currently out-of-use "logical-address log areas"
which was most recently used is selected and used as
an area to which logical-address log information is written
when data are written to the stripe. (3) After all the
data have been written to the target data stripe, the
blocks to which the data have been written are registered
with the TAG information of the target data stripe.
Once all the blocks have been registered, the "logical-
address log area" having been used is released.
[0075] By managing the "logical-address log area" on
the basis of the LRU method as described above, the
logical-address log information is written to a limited
fixed area (or a plurality of such areas) on the disk.
Consequently, writes to the logical-address log area become
faster and are thus completed earlier.

(Third Variation of the Embodiment) 

[0076] Now, a third variation of the embodiment will
be described.

[0077] FIG. 7 is a flow chart showing an operation of 
the third variation of the embodiment. 
[0078] In the third variation of the embodiment, a
plurality of write requests are written at a time. This is
achieved by using a variable (Outstanding Req Count)
managing the number of outstanding write requests being
processed by the RAID speed-up driver 100. The
variable Outstanding Req Count is assumed to be
initialized to "0".

[0079] In FIG. 7, upon receiving the write request Reqi
from the OS file system 50 (step S701), the RAID speed-up
driver 100 registers the write request Reqi in the
variable Outstanding Req Count (step S702). That is, the
content of the variable Outstanding Req Count (counter
value) is incremented by one. Then, it is checked whether
or not the variable Outstanding Req Count is equal to or
larger than a certain constant "A" (step S703). If the
variable is smaller than the constant "A", the process
proceeds to step S704, where an I/O request for the write
request Reqi is issued as shown in FIG. 4 (step S407 in
FIG. 4).
[0080] On the other hand, if the variable Outstanding
Req Count is equal to or larger than the constant "A",
the write request Reqi is kept pending. The write request
Reqi (a plurality of write requests can be registered) is
placed in a pending request queue PList, and when any of
the following conditions is met, all the pending write
requests are written at a time:

(Conditions for the issuance of an I/O request for
pending write requests)



[0081]

a) The number Pcount of pending requests Reqi is
equal to or larger than a constant "B" (check at step
S707).

b) The total of the data sizes Psize of the pending
requests Reqi is equal to or larger than a constant
"C" (check at step S708).

c) The total of the data sizes Psize of the pending
requests Reqi equals or exceeds the size of the
remaining empty area of the current write target data
stripe (check at step S709).

[0082] A timer function TimerRoutine() is registered
with a service of the OS (Operating System) which, when
a time and a function are designated, executes the
designated function after the designated time has
passed. In this case, for example, the command "execute
the designated function TimerRoutine() 30 ms later"
is registered in the OS. A process executed by the timer
function TimerRoutine() and set at step S710 is shown
in FIG. 8.

[0083] In FIG. 7, the following variables are used for
management in order to determine the above described
conditions a) to c).

Number of pending requests ... Pcount
Total of the data sizes of pending requests ... Psize
[0084] If any of the above described conditions a) to
d) is met (Yes at step S707, Yes at step S708, Yes at
step S709, or step S710), the variables Pcount and Psize
are cleared to zero (steps S711 and S712 and steps
S801 and S802 in FIG. 8). It is further checked whether
or not the timer function TimerRoutine() has been set
(step S713). If this function TimerRoutine() has been
set, it is reset (step S714). Then, the write requests in
the pending list PList are set as one write request WReq
to execute a write process according to the procedure
at step S407 in FIG. 4 (step S715 and step S803 in FIG.
8).
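
The admission check at step S703 and the flush conditions a) to c) can be sketched as below. The class Batcher and the threshold values chosen for A, B, and C are assumptions; the source leaves these constants open. The OS timer of step S710 is not modelled: flush() is simply the routine that a timer callback would invoke.

```python
A, B, C = 8, 4, 64 * 1024    # illustrative thresholds; the source leaves them open


class Batcher:
    def __init__(self, stripe_free_bytes):
        self.outstanding = 0              # Outstanding Req Count
        self.plist = []                   # pending request queue PList
        self.psize = 0                    # Psize (Pcount is len(self.plist))
        self.stripe_free_bytes = stripe_free_bytes
        self.issued = []                  # each entry is one I/O: a list of sizes

    def submit(self, size):
        self.outstanding += 1                              # step S702
        if self.outstanding < A:                           # step S703
            self.issued.append([size])                     # step S704
            return
        self.plist.append(size)
        self.psize += size
        if (len(self.plist) >= B                           # condition a)
                or self.psize >= C                         # condition b)
                or self.psize >= self.stripe_free_bytes):  # condition c)
            self.flush()

    def flush(self):
        # Also the body a timer callback would run (condition d)).
        if self.plist:
            self.issued.append(self.plist)     # one grouped request WReq
            self.plist, self.psize = [], 0

    def complete(self, nrequests=1):
        self.outstanding -= nrequests          # step S903


batcher = Batcher(stripe_free_bytes=1 << 20)
batcher.outstanding = A          # pretend the driver is already busy
for size_kb in (4, 6, 2, 8):
    batcher.submit(size_kb * 1024)
```

With B set to four, the fourth pending request satisfies condition a) and the four requests are issued as a single grouped write.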

[0085] In this case, if data are written to the write
target data stripe (for example, the stripe 34) in response
to the write request WReq, into which the requests
connected to the pending list PList are grouped, then
instead of individually writing the data of the plurality of
write requests constituting the write request WReq, the
data of the one write request WReq are written at a time.
Further, for writes to the logical-address log area 18b2,
the data of the entire write request WReq are written at
a time. A process of completing these writes is shown
in FIG. 9.
[0086] In FIG. 9, for the write request Reqi, the RAID
speed-up driver 100 checks whether or not all the
logical-address log numbers have been written once the
data write has been completed (I/O completing process 1)
(step S901), and checks whether or not the data has
been written once the logical-address log has been
completely written (I/O completing process 2) (step
S910). If these writes have been completed, the RAID
speed-up driver 100 determines that the data and
logical-address log for the original write request Reqi have
been completely written, and notifies the OS file system
50 of the I/O completion (step S902). The RAID speed-up
driver 100 subtracts the value for the completed write
request Reqi from the value of the variable Outstanding
Req Count and records the result of the subtraction as
a new value of the variable Outstanding Req Count, thus
completing the process (step S903).
[0087] An example of a specific operation of the third
variation of the embodiment will be described with
reference to FIGS. 10A, 10B and 11. In the example of
operation described below, a large number of write
requests Reqi from the OS file system 50 concurrently
arrive at the RAID speed-up driver 100. FIG. 10A shows
that four of these write requests, Req1 to Req4, arrive at
the RAID speed-up driver 100. At this time, the RAID
speed-up driver 100 has already been processing a
plurality of write requests. When these requests arrive, the
condition Outstanding Req Count ≥ A has been met.
That is, the result of the determination at step S703 in
FIG. 7 is affirmative.
[0088] FIG. 10A shows a write size and a logical block
number for each of the write requests Req1 to Req4. In
this case, the RAID speed-up driver 100 manages each
2 KB of data, and the write addresses from the OS file
system 50 are shown as the logical block numbers of
the write targets. For example, a logical block number
100 means that this is a write target address designated
by the OS file system and the address (= the offset on
the disk) is 100 × 2 KB = 200 KB. Further, the write
request Req1 requires 4 KB of data to be written to the
logical address number 100 of the write target, but since
the RAID speed-up driver 100 manages each 2 KB of data,
it is understood that the request Req1 requires the
logical address numbers V = 100 and 101 to be assigned.
Likewise, since the request Req2 designates 6 KB, it
requires the logical address numbers V = 76, 77, and
78 to be assigned. Since the request Req3 designates
2 KB, it requires the logical address number V = 60 to
be assigned. Since the request Req4 designates 8 KB,
it requires the logical address numbers V = 79, 80, 81,
and 82 to be assigned. Then, it is assumed that the
constant B is four and that the condition Pcount ≥ B at step
S707 has been established.
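
The arithmetic in this example can be checked with a small sketch. The function names byte_offset and logical_blocks are hypothetical; the 2-KB block size and the four requests come from the text.

```python
BLOCK_BYTES = 2 * 1024   # the driver manages data in 2-KB units


def byte_offset(logical_block):
    """Offset on the disk designated by a logical block number."""
    return logical_block * BLOCK_BYTES


def logical_blocks(first_block, size_bytes):
    """Logical block numbers covered by a write of size_bytes."""
    count = -(-size_bytes // BLOCK_BYTES)   # ceiling division
    return list(range(first_block, first_block + count))


# (first logical block, size in KB) for each request in FIG. 10A
requests = {"Req1": (100, 4), "Req2": (76, 6), "Req3": (60, 2), "Req4": (79, 8)}
assigned = {name: logical_blocks(first, kb * 1024)
            for name, (first, kb) in requests.items()}
```

Here byte_offset(100) evaluates to 204800 bytes, that is, 200 KB, and assigned["Req1"] is [100, 101].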

[0089] It is assumed that the pending request list PList
initially contains no write request and that the write
requests Req1 to Req4 arrive in this order. These requests
are sequentially stored in the pending request list PList,
and are arranged as shown in FIG. 10B when the write
request Req4 arrives.

[0090] At this time, the condition at step S707 in FIG.
7 is established, and the control shifts to step S711. At
step S715, the four write requests Req1 to Req4 (20 KB
in total) in the pending request list PList are written to
the disk 180 as the one write request WReq according
to the procedure shown in FIG. 4. At this time, if the
20-KB data from the write request WReq do not fit in the
remaining empty area of the write target data stripe (for
example, the stripe 34), that size of data (for example,
10 KB) which fits in the empty area is separated and
processed. The remaining 10-KB data are grouped into
another write request WReq, for which an I/O request is
then issued with respect to another stripe with an empty
area.
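
The separation of a grouped write that does not fit in the current stripe can be sketched as follows. The function split_for_stripes and its list argument are hypothetical; in particular, the free space of the next stripe (64 KB here) is an assumed value, since the source states only that 10 KB fit in the current stripe.

```python
def split_for_stripes(total_bytes, free_bytes_per_stripe):
    """Split one grouped write across stripes, filling each in order.

    free_bytes_per_stripe lists the empty space of the current stripe
    followed by that of the stripes selected next (hypothetical input).
    Returns (stripe index, bytes written) pairs.
    """
    pieces = []
    remaining = total_bytes
    for index, free in enumerate(free_bytes_per_stripe):
        if remaining == 0:
            break
        take = min(remaining, free)
        if take:
            pieces.append((index, take))
            remaining -= take
    if remaining:
        raise ValueError("not enough free stripe space")
    return pieces


KB = 1024
pieces = split_for_stripes(20 * KB, [10 * KB, 64 * KB])
```

For the 20-KB example, the first 10 KB go to the current stripe and the remaining 10 KB form a second request for the next stripe.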

[0091] FIG. 11 shows the corresponding write to the
disk 180. The write requests Req1 to Req4 from the file
system are originally four separate requests, but since
the write target addresses constitute the one continuous
data stripe 34, these requests are grouped into the one
write request WReq for the write process. Similarly, the
logical address information V = 100, 101, 76, 77, ... 82
is written, via one write I/O request, to the logical-address
log area 18b2, corresponding to the data for which the
batch write has been issued.

[0092] As a result, although, in the prior art, four I/O
requests are required for each of the data write to the
write target data stripe 34 and the write of logical-address
log information to the logical-address log area
18b2, all these writes can be executed with only two
write I/O requests by storing the data and logical-address
log information in the pending request list PList
before the write process. Consequently, the number of
I/O requests issued can be reduced to increase the write
speed on the basis of the collective write. In the above
described example, it is assumed that the constant B is
four and that the condition Pcount ≥ B has been established.
The above description, however, also applies to
the case where any of the above described other
conditions b), c), and d) has been established (step S708,
S709, or S710).

[0093] According to the third variation of the embodiment,
by using a counter to manage the number of requests
transmitted by the upper file system and which
are being processed by the RAID speed-up driver at a
certain time and for which the completion notification
has not been issued yet (specifically, requests that have
entered a disk control system driver and for which the
completion notification has not been issued yet), the
"number of write I/O requests simultaneously issued" at
an arbitrary time can be obtained. If the "number of write
I/O requests simultaneously issued" is equal to or larger
than a predetermined value A, then for subsequent write
requests, writes of corresponding data and logical-address
log information are started after a fixed time has
passed. Then, if (1) a fixed number of write requests or
more newly arrive, (2) the total of the write sizes of write
requests that have arrived exceeds a fixed size, (3) the
total of the write sizes of write requests that have arrived
exceeds the size of the remaining empty area of the
current write target stripe, or (4) a fixed time has passed,
all the data of the pending write requests are written to
the target stripe at a time. All of the logical-address log
information for the simultaneously written data blocks
(plural) is written to the logical-address log area at a
time.

[0094] As a result, the plurality of write requests are 
converted into the one data write process and the one 
write process for the logical-address log information. 
The start of each data request is delayed, but the 
number of write processes decreases, while the write 
size increases. Consequently, the total overhead of 
writes to the disk decreases to thereby improve the 
throughput of the write process. 

(Fourth Variation of the Embodiment) 

[0095] Next, a fourth variation of the embodiment of
the present invention will be described.
[0096] In the fourth variation of the embodiment, not 
only the logical-address log information but also the 
write data size and a checksum are recorded in the log- 
ical-address log area 18b2. 

[0097] FIG. 12 shows how, for write data a of the write
request Req1, (1) the logical addresses V = 100 and
101, (2) a write size Size = 4 KB, and (3) a checksum
ChkSum = 0x16f92aab are written to the logical-address
log area 18b2 corresponding to the write target
data stripe 34. The checksum ChkSum is the value
resulting from the summation of every 4 bytes of the write
data a. When the write data size and the checksum are
thus recorded in the logical-address log area, the
system can recover from failures more easily.
[0098] That is, if the system fails while data are being
written to the data area 18c of the disk 180, it is checked
whether or not the checksum value written to the
logical-address log area 18b2 of the stripe undergoing
the write process equals the checksum value determined
from the data of the data area 18c. If these checksum
values are equal, then it is determined that the data
write has been completed, and the data are treated as
valid. Then, the remaining part of the process (registration
in the address mapping table or the like) is executed
to complete the write process. If the checksum values
are not equal, it is determined that the write has not been
completed, and the data are discarded.
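
The checksum test used during recovery can be illustrated as follows. The source says only that the checksum is the summation of every 4 bytes of the write data; the little-endian byte order, the wrap-around at 32 bits, and the zero padding of a short tail are assumptions of this sketch, as are the function names checksum32 and data_is_valid.

```python
import struct


def checksum32(data):
    """Sum the data as 32-bit words (little-endian and modulo 2**32 assumed)."""
    if len(data) % 4:
        data = data + b"\x00" * (4 - len(data) % 4)   # pad the tail (assumption)
    words = struct.unpack("<%dI" % (len(data) // 4), data)
    return sum(words) & 0xFFFFFFFF


def data_is_valid(logged_checksum, stripe_data):
    """Recovery check: the data count as written only if the sums agree."""
    return checksum32(stripe_data) == logged_checksum


payload = bytes(range(256)) * 16          # 4 KB of sample write data
logged = checksum32(payload)              # value that would sit in the log area
torn = payload[:2048] + b"\x00" * 2048    # second half never reached the disk
```

In this sample, data_is_valid(logged, payload) holds, while the torn copy fails the check and would be discarded.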

(Fifth Variation of the Embodiment) 

[0099] Now, a fifth variation of the embodiment will be
described.

[0100] In the fifth variation of the embodiment, the 
write data a are written not only to the write target data 
stripe 34 but also to the logical-address log area 18b2 
together with the logical-address log information. 

[0101] FIG. 13 shows the organization of data written to
the logical-address log area 18b2 according to the fifth
variation of the embodiment. That is, (1) the logical
addresses V = 100 and 101, (2) the write size Size = 4 KB,
(3) the checksum ChkSum = 0x16f92aab, and (4) the
write data a (4 KB) are written to the logical-address log
area 18b2. The items (1) to (4) are actually processed
through one I/O request, so that the overhead can be
reduced.
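
One way to picture items (1) to (4) travelling in a single I/O request is to pack them into one contiguous record. The byte layout below (a 16-bit address count, 32-bit size and checksum, 32-bit addresses, little-endian) is purely an assumption for illustration; FIG. 13 is not reproduced here and the source does not specify field widths. The checksum value is taken from the text as given, not recomputed from the sample data.

```python
import struct

# Assumed header layout: address count, data size in bytes, checksum.
HEADER = struct.Struct("<HII")


def pack_record(addresses, data, checksum):
    """Build one log record holding items (1) to (4)."""
    head = HEADER.pack(len(addresses), len(data), checksum)
    body = struct.pack("<%dI" % len(addresses), *addresses)
    return head + body + data


def unpack_record(buf):
    """Inverse of pack_record: returns (addresses, size, checksum, data)."""
    count, size, checksum = HEADER.unpack_from(buf, 0)
    off = HEADER.size
    addresses = list(struct.unpack_from("<%dI" % count, buf, off))
    off += 4 * count
    return addresses, size, checksum, buf[off:off + size]


# Sample 4-KB payload; 0x16F92AAB is the example checksum quoted in the text.
record = pack_record([100, 101], b"\xaa" * 4096, 0x16F92AAB)
```

Because the record is one byte string, it can be handed to the disk in one write, which is the overhead reduction the paragraph describes.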

[0102] FIG. 14 is a flow chart showing the procedure
of an operation performed by the RAID speed-up driver
100 according to the fifth variation of the embodiment.
The OS file system 50 inputs the request Reqi (i = 1, 2,
3, 4, ...), which is composed of write block data B0 to Bn
starting with the logical address addri and each
consisting of 2 KB (step S1401).

[0103] It is checked whether or not any data stripe with
an empty area has been assigned to the write request
Reqi as the write target stripe (step S1402). If no such
data stripe has been assigned, one of the free stripes
on the disk is selected (step S1403). Then, a buffer WB
of the same size as this data stripe is provided on the
main memory 13 (step S1404). The buffer WB is provided
on the write buffer 171. The ID of the assigned stripe
is defined as IDk and is written to the stripe ID log area
18b1 of the managing data area 18b as the next entry
(step S1405). The IDs of stripes selected as the write
target are recorded in the stripe ID log area 18b1 in a
time-series manner (31, 34, ...).
[0104] Then, the logical-address log area 18b2 of the
managing data area 18b is assigned (step S1406). Data
blocks Bj to which no write area has been assigned are
assigned as large a part of the next empty area of the
write target data stripe (for example, the stripe 34) as
possible (step S1407).

[0105] The RAID speed-up driver 100 copies the data
of the size ensured at step S1407 to the buffer WB (step
S1408). The buffer WB is a write buffer provided on the
main memory 13 and having a size equal to the stripe
area.

[0106] Then, at step S1409, (1) logical-address information
and (2) write data are written to the logical-address
log area 18b2 via one write I/O request, as shown
in FIG. 13. If the data stripe has no empty area and all
the data written to the stripe fit in the buffer WB (step
S1410), an I/O request is issued for a batch write of the
contents of the buffer WB to the write target data stripe
34, and TAG information is written to the write target data
stripe 34 of the disk 180 (step S1411). Then, the TAG
information is registered in the AMT cache 172, which is
then updated (step S1412). On the other hand, at step S1410,
if the write target data stripe has some empty area, the
processing at steps S1411 and S1412 is omitted, and
the process proceeds to step S1413. At step S1413, it
is determined whether or not all the I/O requests for the
data blocks of the write request Reqi have been issued.
If there still remain data to write, the process returns to
step S1402 to assign a new stripe and make write
requests as described above.
[0107] In this variation of the embodiment, one data
block of the write target data stripe 34 is set as the TAG
area. Accordingly, at step S1411, if the write of the buffer
WB has been completed with the TAG data previously
set on the buffer WB, the write of the TAG data is
simultaneously completed.
[0108] FIGS. 15A and 15B illustrate a flow chart showing
the operations of the I/O completing processes at
steps S1409 and S1411 in FIG. 14. FIG. 15A shows the
completing process (I/O completing process 1) executed
on the logical-address log area 18b2 at step S1409.
In this case, once the write for the write request Reqi
has been completed, the OS file system 50 is notified
of the "completion" (step S1501). On the other hand,
FIG. 15B shows the write completing process (I/O
completing process 2) executed on the write target data
stripe at step S1411. In this case, once the write to the
target data stripe has been completed, the buffer WB is
released (step S1510), and the logical-address log area
18b2 is further released to complete the process (step
S1511).

[0109] The fifth variation of the embodiment is now compared
with the above described first variation of the embodiment. In
the fifth variation, once the logical-address log information
and the write data a have been completely written to the
logical-address log area 18b2, the OS file system 50 can be
notified of the completion. That is, a quicker response is
provided to the OS file system. On the other hand, in the first
variation of the embodiment, the completion notification is not
executed before the two write I/O requests have been completed,
that is, before the writes to the logical-address log area 18b2
and to the write target data stripe 34 have both been completed.

[0110] On the other hand, considering each item of write data,
the fifth variation of the embodiment executes two writes: one
to the logical-address log area 18b2 and one to the data stripe
34. The duplicate write of the write data a to the
logical-address log area 18b2 may impose a heavy burden,
depending on the characteristics of the RAID controller 16,
even if Variation 2 of the embodiment is used to effectively
utilize the cache so that the data is written to the same area.
Thus, the selection of either the first or fifth variation of
the embodiment depends on the characteristics of the RAID
controller 16 used; the variation which exhibits better
performance with that controller is selected.
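
The difference in notification timing between the first and
fifth variations may be pictured with the following minimal
sketch. It is an illustration only: the function names and the
event strings are hypothetical, and the order shown for the two
writes of the first variation is arbitrary, since the
description requires only that both complete before the
notification.

```python
# Hypothetical event model; none of these names come from the embodiment.
NOTIFY = "notify OS file system 50"


def first_variation(blocks):
    """Completion is reported only after both write I/O requests finish."""
    return [
        "write log info -> log area 18b2",
        f"write {len(blocks)} block(s) -> data stripe 34",
        NOTIFY,
    ]


def fifth_variation(blocks, buffer_full):
    """Completion is reported once log info and data reach the log area."""
    events = [
        f"write log info + {len(blocks)} block(s) -> log area 18b2",
        NOTIFY,
    ]
    if buffer_full:
        # Deferred batch write of buffer WB: the data is written a second time.
        events.append("batch write buffer WB -> data stripe 34")
    return events


def writes_before_notify(events):
    """Count the write I/O requests that precede the notification."""
    return events.index(NOTIFY)
```

Under these assumptions, `writes_before_notify` yields two for
the first variation and one for the fifth, while the fifth
variation still performs a second data write once the buffer WB
becomes full.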

(Sixth Variation of the Embodiment) 

[0111] If the system fails while the RAID speed-up driver 100
is executing a write process, the system may be restarted
before the data, the logical-address log information, or the
like has been completely written. In such a case, blocks that
have been completely written can be registered back in the TAG
area corresponding to the write target data stripe and in the
AMT cache 172 to enable the registered data to be subsequently
accessed. The data written to the disk 18 are determined to be
correct if the checksum saved for the data as the
logical-address log information equals the checksum
recalculated from the target data area. For this comparison to
be meaningful, it must be ensured that the value saved to the
logical-address log area 18b2 (including the data checksum
value) is itself "really correct". Thus, for example, for each
entry of the logical-address log area 18b2,

(x) when the entry is written, a particular string (sig- 
nature) is placed at the beginning and end of the 
area of the entry so as to be checked upon a read, 
(y) the checksum of the entire entry is calculated 
and stored, or 

(z) the target stripe ID is recorded and checked
to see whether or not it matches the stripe ID
recorded in the leading block of the logical-address
log area 18b2.

In this manner, the data integrity of each entry may be 
checked. 
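
The checks (x) to (z) may be sketched as follows. This is a
minimal illustration under stated assumptions: the entry
layout, the 4-byte signature value, the use of CRC-32 as the
checksum, and the names `pack_entry` and `entry_is_valid` are
all hypothetical. The description lists (x), (y), and (z) as
alternatives; the sketch applies all three together only for
compactness.

```python
import struct
import zlib

SIGNATURE = b"LALG"  # hypothetical signature string


def pack_entry(stripe_id, logical_addrs, data_checksums):
    """Serialize one log entry: signature, body, entry checksum, signature."""
    body = struct.pack("<II", stripe_id, len(logical_addrs))
    for addr, cs in zip(logical_addrs, data_checksums):
        body += struct.pack("<II", addr, cs)
    entry_cs = zlib.crc32(body)  # checksum of the entire entry, for (y)
    return SIGNATURE + body + struct.pack("<I", entry_cs) + SIGNATURE


def entry_is_valid(raw, header_stripe_id):
    """Apply checks (x), (y), and (z) to one raw entry."""
    if len(raw) < 20:  # two signatures + minimal body + entry checksum
        return False
    # (x) signature at the beginning and end of the entry
    if raw[:4] != SIGNATURE or raw[-4:] != SIGNATURE:
        return False
    body = raw[4:-8]
    stored_cs = struct.unpack("<I", raw[-8:-4])[0]
    # (y) checksum of the entire entry
    if zlib.crc32(body) != stored_cs:
        return False
    # (z) stripe ID must match the one in the leading block of the log area
    stripe_id = struct.unpack("<I", body[:4])[0]
    return stripe_id == header_stripe_id
```

An entry that was only partly written before the failure fails
check (x) or (y), and an entry left over from an earlier use of
the log area for a different stripe fails check (z).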

[0112] The above described process is shown in the 
flow chart of FIG. 16. The write method is assumed to 
be based on the first variation of the embodiment. 
[0113] At step S1601, the logical-address log area 18b2 is
referenced to check whether or not any part of the
logical-address log area 18b2 is in use. Whether the
logical-address log area 18b2 is in use can be determined by,
for example, providing a flag (Valid) at the header of the
logical-address log area 18b2, indicating that this area is in
use, setting this flag (Valid = "1") while the area 18b2 is in
use, and resetting it (Valid = "0") when the area 18b2 is to be
released, as shown in FIGS. 17A to 17D. FIGS. 17A to 17D show
an example in which four logical-address log areas 18b2 are
provided, wherein logical-address logs LA1 and LA2, having
their in-use flags set (Valid = "1"), can be determined to be
in use. Correspondingly, LA1 and LA2 are recorded in an in-use
list S.

[0114] Now, at step S1602, the above described list S is sorted
on the basis of the value of the time stamp TS at the header
section. The list is consequently sorted in a time series
manner. In the example of FIGS. 17A to 17D, since the time
stamp TS (= 11679) of the logical-address log LA2 is smaller
than that (= 11680) of the logical-address log LA1 (that is,
the logical-address log LA2 is older than the logical-address
log LA1), the list S contains the logical-address logs LA2 and
LA1 in this order as a result of the sorting at step S1602.
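
Steps S1601 and S1602 may be sketched as follows, using the
time stamp values of the example above. The `LogHeader` type,
the function name, and the entries labelled LA0 and LA3 (the
two unused areas, with placeholder time stamps) are
hypothetical and serve only to make the sketch self-contained.

```python
from dataclasses import dataclass


@dataclass
class LogHeader:
    name: str       # label used only in this sketch, e.g. "LA1"
    valid: int      # in-use flag: 1 while the area is in use, 0 when released
    timestamp: int  # time stamp TS recorded in the header section


def in_use_sorted(headers):
    """Collect the in-use logs (S1601) and sort them oldest first (S1602)."""
    in_use = [h for h in headers if h.valid == 1]
    return sorted(in_use, key=lambda h: h.timestamp)


headers = [
    LogHeader("LA0", 0, 0),      # unused area, placeholder values
    LogHeader("LA1", 1, 11680),
    LogHeader("LA2", 1, 11679),
    LogHeader("LA3", 0, 0),      # unused area, placeholder values
]
order = [h.name for h in in_use_sorted(headers)]
print(order)  # ['LA2', 'LA1']
```
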
[0115] Then, at step S1603, it is determined that some of the
logical-address log areas 18b2 (elements) are in use, and at
step S1604, the element with the oldest time stamp TS is
selected from the list S (the logical-address log LA2 is first
selected). At step S1605, an area of the main memory 13 which
is provided for a TAG image (provided in the write buffer 171)
is initialized so that "all the blocks of the target data
stripe are invalidated". Then, at step S1606, a variable k is
set at zero. At step S1607 and subsequent steps, for each of
the valid entries in the logical-address log LA2, the data
checksum values are read out and checked to see whether or not
they are equal to the checksums of the data actually written to
the stripe. Whether or not an entry is valid may be determined
as described in (x) to (z) above.
[0116] In the example of the logical-address log LA2 in FIGS.
17A to 17D, it is assumed that entries E0, E1, E2, ... are
valid. The entry E0 is noted. The logical addresses V = 463 and
464 and the checksum CS (ChkSum0 = CS0 and CS1) of each data
block are registered in two physical blocks relative to the
header of the logical-address log. At step S1608, the first and
second data blocks of the data stripe 34 (corresponding to the
logical-address log LA2) are read out to determine checksums
ChkSum1 = CS0' and CS1'.
[0117] At step S1609, the checksums ChkSum0 are compared with
the checksums ChkSum1 (CS0 with CS0' and CS1 with CS1'). If
these checksums are determined to be equal, that block is
determined to be correct, and at step S1610, its logical block
number is registered in the TAG area of the data stripe 34.
FIG. 18 shows how the TAG area appears after the five data
blocks of the entries E0 and E1 have been processed. In this
case, it is assumed that the physical block number of the
leading block of the data stripe 34 is 2,000. Then, at step
S1611, the variable k is set at k + 1, and at step S1612, k is
compared with N (N is the number of data blocks constituting
the data stripe).
[0118] Once the processing at steps S1607 to S1612 has been
executed on all the entries of the logical-address log LA2, the
TAG information is completed. Then, at steps S1613 and S1614,
the TAG information is written to its original address (in the
stripe 34, the last physical block), and the same contents are
registered in the AMT cache 172, which is then updated.
Finally, at step S1615, the flag indicating that the
logical-address log LA2 is in use is reset (Valid = "0"), thus
completing the process of recovering the logical-address log
LA2. By executing a similar process on the logical-address log
LA1, the write of the logical-address logs LA1 and LA2, which
was hindered by the system failure, can be completed.

[0119] The TAG is written to the TAG area in FIG. 18 by
creating TAG data using the logical-address values of valid
data and a logical-address value indicative of an invalid
address. The invalid-logical-address value may be, for example,
a value absent from that partition.
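
The rebuilding of the TAG image at steps S1605 to S1610 may be
sketched as follows. This is a minimal illustration under
stated assumptions: CRC-32 stands in for the checksum, whose
algorithm the description does not specify, and the constant
`INVALID_LA`, the function `rebuild_tag`, and the flat
(block index, logical address, checksum) entry format are
hypothetical.

```python
import zlib

# Hypothetical marker for an invalid address: a value assumed to be
# absent from the partition.
INVALID_LA = 0xFFFFFFFF


def rebuild_tag(entries, stripe_blocks):
    """Rebuild the TAG image of one stripe from its logical-address log.

    entries: list of (block_index, logical_address, logged_checksum)
    stripe_blocks: list of bytes objects, one per data block of the stripe
    """
    # S1605: start with every block of the target stripe invalidated.
    tag = [INVALID_LA] * len(stripe_blocks)
    for block_index, logical_address, logged_cs in entries:
        # S1608: recompute the checksum from the data actually on the stripe.
        actual_cs = zlib.crc32(stripe_blocks[block_index])
        # S1609-S1610: register the logical address only if the two match.
        if actual_cs == logged_cs:
            tag[block_index] = logical_address
    return tag


# The third block was logged but never reached the stripe before the failure.
blocks = [b"AAAA", b"BBBB", b"\x00\x00\x00\x00"]
log = [
    (0, 463, zlib.crc32(b"AAAA")),
    (1, 464, zlib.crc32(b"BBBB")),
    (2, 465, zlib.crc32(b"CCCC")),
]
tag = rebuild_tag(log, blocks)
```

In this example the first two blocks are registered with
logical addresses 463 and 464, and the third block keeps the
invalid-address value because its recalculated checksum differs
from the logged one.
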
[0120] As described above, the present embodiment is based on
the write method of its first variation (the data of each
request are written to the target stripe), but the write method
of its fifth variation (the data are first written to the
logical-address log area, and then to the target stripe in
units of the buffer WB once the buffer WB becomes full) can
achieve a similar process with the following changes:

(i) At step S1608, the checksum ChkSum1 is determined
from the data written to the logical-address log
area 18b2.

(ii) The write data is written in the target stripe
between steps S1609 and S1610.

1. A disk control system that responds to a write request
from an upper file system (50) to translate logical
addresses into physical ones and then continuously write
write-requested data to a data stripe as a write area
composed of a plurality of disk apparatuses (180), the
system characterized by comprising:

means (100), in response to said write request, for
sequentially writing data blocks on empty areas (34, 40,
51) of an assigned target data stripe of data areas (18b)
provided on said plurality of disks (180), in such a
manner that at least one data block is written at a time;
means (100), in response to said write request, for
writing said logical addresses from said upper file
system (50) on data managing areas (18b) provided on the
plurality of disks (180), as logical-address log
information; and
means (100) for notifying, in response to the write
request from said upper file system (50), the upper file
system (50) that the write has been completed, after said
data and said logical-address log information have been
completely written.

2. The disk control system according to claim 1,
characterized in that an I/O request is issued for one
write request for the plurality of data blocks transferred
in said write request by said upper file system (50), and
said plurality of data blocks are simultaneously written to
the empty areas (34, 40, 51) of said write target data
stripe.

3. The disk control system according to claim 1,
characterized in that said logical-address log information
is written on at least one logical-address log area (18b2)
provided in said data managing area.

4. The disk control system according to claim 3, char- 
acterized in that if the data managing area is com- 
posed of said plurality of logical-address log areas 
(18b2), one of said logical-address log areas which 
is not currently used and which was most recently 
used is used. 

5. The disk control system according to claim 3,
characterized in that said logical-address log areas
(18b2) are fixedly provided on said disk (180).

6. The disk control system according to claim 3, char- 
acterized in that if the system is shut down before 
said data write process is completed, said logical- 
address log information recorded in said logical-ad- 
dress log area (18b2) is used to execute data re- 
covery after system reboot. 

7. The disk control system according to claim 1,
characterized in that said data managing area (18b)
has a stripe ID log area (18b1) in which a written
stripe ID and a stripe ID to be written are recorded.

8. The disk control system according to claim 1,
characterized in that part of said write target data stripe
is used as an area (TAG) in which tag information
is written, and logical-address information that has
been written on said logical-address log area (18b2)
when data for all the areas of said write target data
stripe have been provided is written as said tag
information.

9. The disk control system according to claim 8,
characterized in that an address mapping table cache
(172) that stores correspondences between logical
addresses and physical addresses relating to write
data is provided in part of the main memory (13),
and when the data and the tag information have
been completely written on said write target data
stripe, said address mapping table cache (172) is
updated on the basis of said tag information.

10. A disk control system that responds to a write request
from an upper file system (50) to translate logical
addresses into physical ones and then continuously write
write-requested data on a data stripe as a write area
composed of a plurality of disk apparatuses (180), the
system characterized by comprising:

means (100) for writing a plurality of block data
corresponding to a plurality of write requests, to a
write buffer (171) provided in a main memory (13);
data write means (100) for responding to said plurality
of write requests to simultaneously write all the
plurality of data blocks stored in said write buffer on
empty areas (34, 40, 51) of an assigned target data
stripe of data areas (18c) provided on said plurality of
disks (180);
log write means (100) for simultaneously writing said
logical addresses from said upper file system (50)
corresponding to said plurality of block data, on data
areas (18c) provided on said plurality of disks (180),
as logical-address log information; and
means (100) for notifying, with respect to the write
requests from said upper file system (50), the upper
file system (50) that the writes have been completed,
after said data and said logical-address log information
have been completely written.

11. The disk control system according to claim 10,
characterized in that the number of said write requests
is managed as a variable, and if said variable is larger
than a predetermined value, said write requests are
retained as pending requests and a batch write process is
executed using said data and said log write means (100),
and if said variable is smaller than the predetermined
value, a process of writing each of said write requests is
executed.

12. The disk control system according to claim 11,
characterized in that whether or not to execute said
batch write process is determined when the number of said
pending write requests exceeds a predetermined number, or
the total of the data sizes of said pending requests
exceeds the size of a remaining empty area of said write
target data stripe, or a time during which the pending
requests have been kept pending exceeds a predetermined
value.

13. The disk control system according to claim 10,
characterized in that said logical-address log
information is written on at least one logical-address
log area (18b2) provided in said data managing area (18b).

14. The disk control system according to claim 10,
characterized in that if the data managing area is
composed of said plurality of logical-address log areas
(18b2), one of said logical-address log areas which is not
currently used and which was most recently used is used.

15. The disk control system according to claim 10,
characterized in that said logical-address log areas
(18b2) are fixedly provided on said disk (180).

16. The disk control system according to claim 15,
characterized in that if the system is shut down
before said data write process is completed, said
logical-address log information recorded in said
logical-address log area (18b2) is used to execute data
recovery after system reboot.

17. The disk control system according to claim 10,
characterized in that said data managing area (18b)
has a stripe ID log area (18b1) in which a written
stripe ID and a stripe ID to be written are recorded.

18. The disk control system according to claim 10,
characterized in that part of said write target data
stripe is used as an area (TAG) in which tag information
is written, and logical-address information that has been
written to said logical-address log area (18b2) when data
for all the areas of said write target data stripe have
been provided is written as said tag information.

19. The disk control system according to claim 18,
characterized in that an address mapping table
cache (172) that stores correspondences between
logical addresses and physical addresses relating
to write data is provided in part of the main memory
(13), and when the data and the tag information
have been completely written on said write target
data stripe, said address mapping table cache (172)
is updated on the basis of said tag information.

20. A disk control system that responds to a write request
from an upper file system (50) to translate logical
addresses into physical ones and then continuously write
write-requested data on a data stripe as a write area
composed of a plurality of disk apparatuses (180), the
system characterized by comprising:

means (100), in response to said write request, for
sequentially writing data blocks to empty areas (34, 40,
51) of an assigned target data stripe of data areas
(18c) provided on said plurality of disks (180), in such
a manner that at least one data block is written at a
time;
means (100), in response to said write request, for
writing said logical addresses from said upper file
system (50), write data sizes, and checksums of write
data to logical-address areas (18b2, FIG. 12) provided
on said plurality of disks (180), as logical-address log
information; and
means (100) for notifying, in response to the write
request from said upper file system (50), the upper file
system (50) that the write has been completed, after
said data and said logical-address log information have
been completely written.



21. The disk control system according to claim 20,
characterized in that if the system fails during a
write, it is checked whether or not a checksum value
being written on said logical-address log area (18b2) of
a stripe being subjected to the write process equals a
checksum value determined from the data written on said
data area (18c), and if these checksum values are equal,
the data are treated as valid, and if these checksum
values are not equal, it is determined that the data write
has not been completed and the data are discarded.

22. A disk control system that responds to a write request
from an upper file system (50) to translate logical
addresses into physical ones and then continuously write
write-requested data on a data stripe as a write area
composed of a plurality of disk apparatuses (180), the
system characterized by comprising:

means (100), in response to said write request, for
writing said logical addresses from the upper file
system (50), write data sizes, and checksums of data
written to a logical-address log area (18b2) provided on
said plurality of disks (180), as logical-address log
information;
means (100), in response to said write request, for
sequentially writing data blocks on empty areas (34, 40,
51) of an assigned target data stripe of data areas
(18c) provided on the plurality of disks (180), in such
a manner that at least one data block is written at a
time; and
means (100) for notifying, in response to the write
request from said upper file system (50), the upper file
system that the write has been completed, after said
logical-address log information has been completely
written in the logical-address area (18b2).

23. The disk control system according to claim 22,
characterized in that a time required to respond to
the upper file system is reduced by writing said data
blocks on said logical address log area (18b2).

24. A disk control system that responds to a write request
from an upper file system (50) to translate logical
addresses into physical ones and then continuously write
write-requested data on a data stripe as a write area
composed of a plurality of disk apparatuses (180), the
system characterized by comprising:

means (100), in response to said write request, for
recording flags indicative of validity or invalidity,
stripe ID numbers, and write time stamps for final data
in header sections of logical address log areas (18b2)
provided on the plurality of disks (180), and write time
stamps for at least one block data processed by the
write request, at least one logical address, and at
least one checksum, to entry sections of said
logical-address log areas (18b2) as logical-address log
information;
means (100), in response to said write request, for
sequentially writing data blocks on empty areas (34, 40,
51) of an assigned target data stripe of data areas
(18c) provided on said plurality of disks (180), in such
a manner that at least one data block is written at a
time; and
means, if the system fails during a write, for checking
whether or not a checksum value being written on said
logical-address log area (18b2) of a stripe being
subjected to the write process and for which a valid
flag has been set equals a checksum value determined
from the data written on said data area (18c), and for
treating the data as valid if these checksum values are
equal, while determining that the write has not been
completed and discarding the data if these checksum
values are unequal.

25. The disk control system according to claim 24,
characterized in that if said checksums are equal,
a list of correspondences between physical and logical
addresses is recorded in a TAG area (TAG) of said data
stripe, and the correspondence list (FIG. 18) in the TAG
area is reflected in the address mapping table of the
main memory.

26. A disk control method that responds to a write request
from an upper file system (50) to translate logical
addresses into physical ones and then continuously write
write-requested data to a data stripe as a write area
composed of a plurality of disk apparatuses (180), the
method characterized by comprising:

responding to said write request to sequentially write
data blocks on empty areas (34, 40, 51) of an assigned
target data stripe of data areas (18c) provided on said
plurality of disks (180), in such a manner that at least
one data block is written at a time;
responding to said write request to write said logical
addresses from said upper file system (50) on data
managing areas (18b) provided on the plurality of disks
(180), as logical-address log information; and
notifying, in response to the write request from said
upper file system (50), the upper file system (50) that
the write has been completed, after said data and said
logical-address log information have been completely
written.

27. A disk control method that responds to a write request
from an upper file system (50) to translate logical
addresses into physical ones and then continuously write
write-requested data on a data stripe as a write area
composed of a plurality of disk apparatuses (180), the
method characterized by comprising:

writing a plurality of block data corresponding to a
plurality of write requests, in a write buffer (171)
provided in a main memory (13);
responding to said plurality of write requests to
simultaneously write all the plurality of data blocks
stored in said write buffer (171) on empty areas (34,
40, 51) of an assigned target data stripe of data areas
(18c) provided on said plurality of disks (180);
simultaneously writing all said logical addresses from
said upper file system (50) corresponding to said
plurality of block data, on data managing areas (18b)
provided on said plurality of disks (180), as
logical-address log information; and
notifying, with respect to the write requests from said
upper file system (50), the upper file system (50) that
the writes have been completed, after said data and said
logical-address log information have been completely
written.

28. A disk control method that responds to a write request
from an upper file system (50) to translate logical
addresses into physical ones and then continuously write
write-requested data on a data stripe as a write area
composed of a plurality of disk apparatuses (180), the
method characterized by comprising:

responding to said write request to sequentially write
data blocks on empty areas (34, 40, 51) of an assigned
target data stripe of data areas (18c) provided on said
plurality of disks (180), in such a manner that at least
one data block is written at a time;
responding to said write request to write said logical
addresses from said upper file system (50), write data
sizes, and checksums of write data on a logical-address
log area (18b2) provided on said plurality of disks
(180), as logical-address log information; and
notifying, in response to the write request from said
upper file system (50), the upper file system (50) that
the write has been completed, after said data and said
logical-address log information have been completely
written.

29. A disk control method that responds to a write request
from an upper file system (50) to translate logical
addresses into physical ones and then continuously write
write-requested data on a data stripe as a write area
composed of a plurality of disk apparatuses (180), the
method characterized by comprising:

responding to said write request to record flags
indicative of validity or invalidity, stripe ID numbers,
and write time stamps for final data in header sections
of logical address log areas (18b2) provided on the
plurality of disks (180), and write time stamps for at
least one block data processed by the write request, at
least one logical address, and at least one checksum, on
entry sections of said logical-address log areas (18b2)
as logical-address log information;
responding to said write request to sequentially write
data blocks on empty areas (34, 40, 51) of an assigned
target data stripe of data areas (18c) provided on said
plurality of disks (180), in such a manner that at least
one data block is written at a time; and
if the system fails during a write, checking whether or
not a checksum value being written on said
logical-address log area (18b2) of a stripe being
subjected to the write process and for which a valid
flag has been set is equal to a checksum value
determined from the data written on said data area
(18c), and treating the data as valid if these checksum
values are equal, while determining that the write has
not been completed and discarding the data if these
checksum values are unequal.